Tagging Unknown Proper Names Using Decision Trees
Authors
Abstract
This paper describes a supervised learning method that automatically selects the most distinctive features from a set of noun phrases embedding proper names of different semantic classes. The result of the learning process is a decision tree that classifies an unknown proper name on the basis of its context of occurrence. This classifier is used to estimate the probability distribution of an out-of-vocabulary proper name over a tagset. This probability distribution is in turn used to estimate the parameters of a stochastic part-of-speech tagger.
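The abstract does not give the paper's features, tagset, or tree-induction algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: an ID3-style tree over two hypothetical context features (`prev`, `next`), a hypothetical three-tag `TAGSET`, invented toy training data, and add-one smoothing at the leaves so that every tag receives a nonzero probability.

```python
import math
from collections import Counter

# Hypothetical tagset; the paper's actual semantic classes are not given here.
TAGSET = ["PERSON", "PLACE", "ORG"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of tags."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def distribution(labels, alpha=1.0):
    """Add-alpha smoothed probability distribution over TAGSET."""
    counts = Counter(labels)
    total = len(labels) + alpha * len(TAGSET)
    return {tag: (counts[tag] + alpha) / total for tag in TAGSET}


def build_tree(examples, features):
    """Grow an ID3-style tree; each example is (context_dict, tag)."""
    labels = [tag for _, tag in examples]
    if len(set(labels)) <= 1 or not features:
        return {"leaf": distribution(labels)}

    def gain(feature):
        # Information gain of splitting the examples on this feature.
        groups = {}
        for ctx, tag in examples:
            groups.setdefault(ctx[feature], []).append(tag)
        remainder = sum(len(g) / len(labels) * entropy(g)
                        for g in groups.values())
        return entropy(labels) - remainder

    best = max(features, key=gain)
    if gain(best) <= 0:
        return {"leaf": distribution(labels)}

    rest = [f for f in features if f != best]
    branches = {}
    for value in {ctx[best] for ctx, _ in examples}:
        subset = [(c, t) for c, t in examples if c[best] == value]
        branches[value] = build_tree(subset, rest)
    # "default" is used when a context value was never seen in training.
    return {"feature": best, "branches": branches,
            "default": distribution(labels)}


def tag_distribution(tree, ctx):
    """Walk the tree for one context and return P(tag | context)."""
    while "leaf" not in tree:
        child = tree["branches"].get(ctx.get(tree["feature"]))
        if child is None:
            return tree["default"]
        tree = child
    return tree["leaf"]


# Invented toy data: the words before and after a proper name.
TRAIN = [
    ({"prev": "mr", "next": "said"}, "PERSON"),
    ({"prev": "mrs", "next": "said"}, "PERSON"),
    ({"prev": "in", "next": "."}, "PLACE"),
    ({"prev": "in", "next": ","}, "PLACE"),
    ({"prev": "the", "next": "company"}, "ORG"),
    ({"prev": "the", "next": "group"}, "ORG"),
]

tree = build_tree(TRAIN, ["prev", "next"])
dist = tag_distribution(tree, {"prev": "in", "next": "yesterday"})
print(dist)
```

In a stochastic tagger, a distribution like `dist` could stand in for the lexical probabilities of a word that is missing from the tagger's lexicon; how the paper actually integrates it is not stated in the abstract.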
Similar resources
An Approach to Proper Name Tagging for German
This paper presents an incremental method for the tagging of proper names in German newspaper texts. The tagging is performed by the analysis of the syntactic and textual contexts of proper names together with a morphological analysis. The proper names selected by this process supply new contexts which can be used for finding new proper names, and so on. This procedure was applied to a small Ge...
Automatic Semantic Tagging of Unknown Proper Names
Implemented methods for proper names recognition rely on large gazetteers of common proper nouns and a set of heuristic rules (e.g. Mr. as an indicator of a PERSON entity type). Though the performance of current PN recognizers is very high (over 90%), it is important to note that this problem is by no means a "solved problem". Existing systems perform extremely well on newswire corpora by virtu...
Proper Nouns Recognition in Arabic Crime Text Using Machine Learning Approach
Named Entity Recognition (NER) identifies proper nouns in a text and categorizes each as a distinct kind of named entity. This function enables the extraction of people's names, locations, organizations, and currencies. Much research exists in this area where Arabic NER is concerned. However, recognizing Arabic named entities is challenging due to the complexity of the Arabic language. These comp...
Amazigh Part-of-speech Tagging Using Markov Models and Decision Trees
The main goal of this work is the implementation of a new tool for Amazigh part-of-speech tagging using Markov models and decision trees. After studying different approaches and problems of part-of-speech tagging, we have implemented a tagging system based on TreeTagger, a generic stochastic tagging tool that is very popular for its efficiency. We have gathered a working corpus, large enough to ens...
A Comparison of Three Machine Learning Methods for Amazigh POS Tagging
Part-of-speech tagging (POS tagging) has a crucial role in different fields of natural language processing (NLP), including Speech Recognition, Natural Language Parsing, Information Retrieval and Multi Words Term Extraction. This paper describes a set of experiments involving the application of three state-of-the-art part-of-speech taggers to Amazigh texts, using a tagset of 28 tags. The taggers...
Publication date: 2000